Review of approaches for paraphrase identification

Abstract

This article reviews approaches to the problem of paraphrase identification. It describes the problem's relevance and its use in tasks such as plagiarism detection, text simplification, and information search. Several classes of solutions are considered. The first approach is based on manual rules: it uses manually selected features and fundamental properties. The second is based on lexical similarity and draws on various databases and ontologies. Machine learning-based approaches are also presented, and the paper describes different architectures that can be used to identify paraphrases. The last class considered is deep learning, including modern transformer models.
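
As an illustration of the lexical-similarity class of approaches mentioned in the abstract, the following is a minimal sketch of a token-overlap baseline. It is not taken from the article: the tokenizer, the Jaccard measure, the function names, and the 0.5 threshold are all assumptions chosen for brevity.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def jaccard_similarity(a, b):
    """Jaccard overlap of the two token sets: |A & B| / |A | B|."""
    tokens_a, tokens_b = set(tokenize(a)), set(tokenize(b))
    if not tokens_a and not tokens_b:
        return 1.0  # two empty texts are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def is_paraphrase(a, b, threshold=0.5):
    """Hypothetical decision rule: the threshold is an assumption, not a value from the article."""
    return jaccard_similarity(a, b) >= threshold
```

A baseline of this kind ignores word order and synonyms, which is one reason lexical approaches are often combined with external databases and ontologies.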

Similar resources

On Paraphrase Identification Corpora

We analyze in this paper a number of data sets proposed over the last decade or so for the task of paraphrase identification. The goal of the analysis is to identify the advantages as well as shortcomings of the previously proposed data sets. Based on the analysis, we then make recommendations about how to improve the process of creating and using such data sets for evaluating in the future app...

Convolutional Neural Network for Paraphrase Identification

We present a new deep learning architecture Bi-CNN-MI for paraphrase identification (PI). Based on the insight that PI requires comparing two sentences on multiple levels of granularity, we learn multigranular sentence representations using convolutional neural network (CNN) and model interaction features at each level. These features are then the input to a logistic classifier for PI. All para...

Discriminative Phrase Embedding for Paraphrase Identification

This work, concerning the paraphrase identification task, on the one hand contributes to expanding deep learning embeddings to include continuous and discontinuous linguistic phrases. On the other hand, it comes up with a new scheme, TF-KLD-KNN, to learn the discriminative weights of words and phrases specific to the paraphrase task, so that a weighted sum of embeddings can represent sentences more effective...

Molecular approaches for detection and identification of foodborne pathogens

Foodborne pathogens comprise microorganisms such as viruses, bacteria and parasites that can be transmitted by food and affect public health worldwide. The most common viruses transmitted via food are hepatitis A virus and Norwalk-like caliciviruses. Also, the most common bacteria involved in foodborne illnesses are Campylobacter jejuni, Clostridium perfringens, Salmonella spp, Escherichia...

Paraphrase Identification by Text Canonicalization

This paper proposes an approach to sentence-level paraphrase identification by text canonicalization. The source sentence pairs are first converted into surface text that approximates canonical forms. A decision tree learning module which employs simple lexical matching features then takes the output canonicalized texts as its input for a supervised learning process. Experiments on the Microsoft...
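
The two stages described in this abstract, canonicalization followed by simple lexical matching features, can be sketched roughly as follows. This is not the paper's implementation: the specific rules (a small contraction table, removal of thousands separators, number words mapped to digits), the feature names, and all function names are assumptions made for illustration.

```python
import re

# Hypothetical canonicalization tables; the paper's actual rules are not given in the abstract.
CONTRACTIONS = {"can't": "cannot", "won't": "will not", "it's": "it is"}
NUMBER_WORDS = {"one": "1", "two": "2", "three": "3", "four": "4", "five": "5"}


def canonicalize(text):
    """Convert a sentence into a list of tokens in an approximate canonical form."""
    text = text.lower()
    for short, full in CONTRACTIONS.items():
        text = re.sub(r"\b" + re.escape(short) + r"\b", full, text)
    text = re.sub(r"(?<=\d),(?=\d)", "", text)  # "1,000" becomes "1000"
    tokens = re.findall(r"[a-z0-9]+", text)
    return [NUMBER_WORDS.get(token, token) for token in tokens]


def lexical_features(a, b):
    """Simple lexical matching features computed on the canonicalized sentence pair."""
    tokens_a, tokens_b = canonicalize(a), canonicalize(b)
    set_a, set_b = set(tokens_a), set(tokens_b)
    shared = len(set_a & set_b)
    return {
        "overlap_a": shared / len(set_a) if set_a else 0.0,
        "overlap_b": shared / len(set_b) if set_b else 0.0,
        "length_diff": abs(len(tokens_a) - len(tokens_b)),
    }
```

Feature dictionaries like these would then serve as input to a supervised classifier, such as the decision tree learner the abstract mentions.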

Journal

Journal title: ??????

Year: 2023

ISSN: 2586-4629, 2765-5407

DOI: https://doi.org/10.17721/1812-5409.2023/1.10